Skip to content

GEODE-6973: Use cachelistener to synchronize typeToId with IdToType#4014

Merged
gesterzhou merged 17 commits intoapache:developfrom
DonalEvans:feature/GEODE-6973-2
Sep 10, 2019
Merged

GEODE-6973: Use cachelistener to synchronize typeToId with IdToType#4014
gesterzhou merged 17 commits intoapache:developfrom
DonalEvans:feature/GEODE-6973-2

Conversation

@gesterzhou
Copy link
Copy Markdown
Contributor

    Co-authored-by: Xiaojian Zhou <gzhou@pivotal.io>
    Co-authored-by: Donal Evans <doevans@pivotal.io>

Thank you for submitting a contribution to Apache Geode.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?

  • Has your PR been rebased against the latest commit within the target branch (typically develop)?

  • Is your initial contribution a single, squashed commit?

  • Does gradlew build run cleanly?

  • Have you written or updated unit tests to verify your changes?

  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?

Note:

Please ensure that once the PR is submitted, check Concourse for build issues and
submit an update to your PR as soon as possible. If you need help, please send an
email to dev@geode.apache.org.

        Co-authored-by: Xiaojian Zhou <gzhou@pivotal.io>
        Co-authored-by: Donal Evans <doevans@pivotal.io>
Donal Evans and others added 4 commits September 6, 2019 17:00
Co-authored-by: Donal Evans <doevans@pivotal.io>
Co-authored-by: Xiaojian Zhou <gzhou@pivotal.io>
return newType.getTypeId();
} finally {
// flush the tmpTypeToId map for who introduced this new pdxType
if (!tmpTypeToId.isEmpty()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this finally needed? How would the tmpTypeToId not already be empty here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have discussed that, it's safe to do it when the creator holds the d-lock.
The listener will only add into tmpTypeToId map now. We use it to make sure that the distribution should all finished before flushing anything into typeToId map (they stayed in tmpTypeToId map now)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But wouldn't the line above (line 386) when we actually use the map, have cleared it? Why do we need to reclear in the finally? Why is the duplicated code needed in the finally block? Same goes for the other method...

Copy link
Copy Markdown
Contributor

@bschuchardt bschuchardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to use variable names that don't begin with "tmp" and are more descriptive of what you're accomplishing? It seems like you're staging additions to other collections so maybe "pending" instead of "tmp" would be better.

@gesterzhou
Copy link
Copy Markdown
Contributor Author

I changed the map's name based on bruce's suggestion.

@gesterzhou gesterzhou dismissed bschuchardt’s stale review September 9, 2019 18:38

The map name is changed to "pending"

@gesterzhou gesterzhou requested a review from jhuynh1 September 9, 2019 18:38
Copy link
Copy Markdown
Contributor

@bschuchardt bschuchardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry but I find this whole change set difficult to follow and think it would be pretty much impossible for a new developer to figure out. Is there any way to make this be less piecemeal, with one method throwing stuff into a collection that it's "understood" that another method will consume under some lock/synchronization? PDX already has confusing caches and this just seems to exacerbate the situation.

Co-authored-by: Donal Evans <doevans@pivotal.io>
Co-authored-by: Xiaojian Zhou <gzhou@pivotal.io>
@gesterzhou gesterzhou closed this Sep 9, 2019
@gesterzhou gesterzhou reopened this Sep 9, 2019
Authored-by: Donal Evans <doevans@pivotal.io>
@upthewaterspout
Copy link
Copy Markdown
Contributor

@bschuchardt - To make your suggestion more concrete - would it help to move pendingTypeToId and typeToId to a separate data structure that encapsulates that functionality?

IMO this whole pendingTypeToId business is essentially working around a limitation in our transactions. I would expect that if a CacheListener fires in Member A as a result of a transaction, then a get in Member B after that CacheListener event should see the change. That is unfortunately not the case.

Authored-by: Donal Evans <doevans@pivotal.io>
Copy link
Copy Markdown
Contributor

@upthewaterspout upthewaterspout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where did all of this sample json come from? It has email address, urls in it?

How likely is it for some of these new tests to be flaky? I see things like toleratedCollisionFraction=0.01f? Is this a test with a lot of random numbers that may occasionally fail? Can we at least lock it down with a seed so that it either passes or fails consistently?

@bschuchardt
Copy link
Copy Markdown
Contributor

@upthewaterspout yes, that would help. It's like we're writing to a structure but don't want to commit the changes in the current thread. The changes need to be committed in another thread. Encapsulating the collections in another object that requires that the correct locks are obtained before transferring from pending to committed would go a long way toward making these changes more understandable without hearing Gester's presentation on the solution.

Authored-by: Donal Evans <doevans@pivotal.io>
@DonalEvans
Copy link
Copy Markdown
Contributor

@upthewaterspout

I removed those tests. You're right, they were potentially flaky and not testing any behaviour that wasn't accounted for elsewhere. As for where the sample JSON came from, it was just an edited version of an existing JSON resource used in other tests.

@gesterzhou
Copy link
Copy Markdown
Contributor Author

The testJSON.txt is from ./geode-core/src/distributedTest/resources/org/apache/geode/pdx/jsonStrings/json4.txt.

We can change it to dummy url and email.

@gesterzhou
Copy link
Copy Markdown
Contributor Author

gesterzhou commented Sep 10, 2019

I will create an inner class LocalReverseMap to wrap the typeToId, enumToId, pendingTypeToId, pendingEnumToId.

@gesterzhou
Copy link
Copy Markdown
Contributor Author

I created an inner class to wrap all these local maps. Pls review again.

Copy link
Copy Markdown
Contributor

@bschuchardt bschuchardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's much better - thank you

Donal Evans added 2 commits September 10, 2019 09:12
Authored-by: Donal Evans <doevans@pivotal.io>
Authored-by: Donal Evans <doevans@pivotal.io>
@gesterzhou gesterzhou merged commit cdca788 into apache:develop Sep 10, 2019
gesterzhou added a commit that referenced this pull request Sep 10, 2019
gesterzhou added a commit that referenced this pull request Sep 10, 2019
…ToType (#4014)"

This reverts commit cdca788.
Click too fast and the message is messy. Will re-merge.
@gesterzhou gesterzhou deleted the feature/GEODE-6973-2 branch September 11, 2019 00:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants